
 Abu Dhabi


Trump's Computer Chip Deals With Saudi Arabia and UAE Divide US Government

NYT > Economy

Over the course of a three-day trip to the Middle East, President Trump and his emissaries from Silicon Valley have transformed the Persian Gulf from an artificial-intelligence neophyte into an A.I. power broker. They have reached an enormous deal with the United Arab Emirates to deliver hundreds of thousands of today's most advanced chips from Nvidia annually to build one of the world's largest data center hubs in the region, three people familiar with the talks said. The shipments would begin this year and include roughly 100,000 chips for G42, an Emirati A.I. firm, with the rest going to U.S. cloud service providers. The administration revealed the agreement on Thursday in an announcement unveiling a new A.I. campus in Abu Dhabi supported by 5 gigawatts of electrical power. It would be the largest such project outside of the United States and help U.S. companies serve customers in Africa, Europe and Asia, the administration said.


The Middle East Has Entered the AI Group Chat

WIRED

Donald Trump's jaunt to the Middle East featured an entourage of billionaire tech bros, a fighter-jet escort, and business deals designed to reshape the global landscape of artificial intelligence. On the final stop of the tour in Abu Dhabi, the US President announced that unnamed US companies would partner with the United Arab Emirates to create the largest AI datacenter cluster outside of America. Trump said that the US companies will help G42, an Emirati company, build five gigawatts of AI computing capacity in the UAE. Sheikh Tahnoon bin Zayed Al Nahyan, who leads the UAE's Artificial Intelligence and Advanced Technology Council, and is in charge of a $1.5 trillion fortune aimed at building AI capabilities, said the move will strengthen the UAE's position "as a hub for cutting-edge research and sustainable development, delivering transformative benefits for humanity." A few days earlier, as Trump arrived in Riyadh, Saudi Arabia announced Humain, an AI investment firm owned by the kingdom's Public Investment Fund.


Bacteria-inspired robot uses 12 spinning flagella to roam underwater

New Scientist

An underwater robot can delicately propel itself in any direction with its 12 flexible arms, inspired by the flagella of bacteria. Its creators claim it can carry out underwater inspections without endangering humans or wildlife, as propeller-driven robots would. Flagella are tiny, hair-like protrusions found on many bacteria that can spin clockwise or counterclockwise to create propulsion. "[Bacteria] have something called a biological motor, which rotates this elongated structure, and this elongated structure produces thrust, and that's how bacteria is propelled," says Anup Teejo Mathew at Khalifa University in Abu Dhabi…


UnityAI-Guard: Pioneering Toxicity Detection Across Low-Resource Indian Languages

arXiv.org Artificial Intelligence

This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an impressive average F1-score of 84.23% across seven languages, leveraging a dataset of 888k training instances and 35k manually verified test instances. By advancing multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.
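For readers who want a concrete picture of the task, the sketch below shows what binary toxicity scoring with a fine-tuned multilingual classifier could look like. The checkpoint name, label indexing, and threshold are placeholders for illustration, not the released UnityAI-Guard models or public API.

```python
# Minimal sketch of binary toxicity classification for Indic-language text.
# The checkpoint name and label order below are hypothetical placeholders,
# not the actual UnityAI-Guard models or API.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "your-org/indic-toxicity-binary"  # hypothetical checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def is_toxic(text: str, threshold: float = 0.5) -> bool:
    """Return True if the toxic-class probability exceeds the threshold."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        logits = model(**inputs).logits
    prob_toxic = torch.softmax(logits, dim=-1)[0, 1].item()  # assumes index 1 = toxic
    return prob_toxic >= threshold

print(is_toxic("उदाहरण वाक्य"))  # example Hindi input
```

In practice one would call the project's published checkpoints or public API rather than this stand-in classifier.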



VisMin: Visual Minimal-Change Understanding

Neural Information Processing Systems

Fine-grained understanding of objects, attributes, and relationships between objects is crucial for visual-language models (VLMs). To evaluate VLMs' fine-grained understanding, existing benchmarks primarily focus on evaluating VLMs' capability to distinguish between two very similar captions given an image. In this paper, our focus is on evaluating VLMs' capability to distinguish between two very similar images given a caption. To this end, we introduce a new, challenging benchmark termed Visual Minimal-Change Understanding (VisMin), which requires models to predict the correct image-caption match given two images and two captions. Importantly, the image pair (as well as the caption pair) contains minimal changes, i.e., between the two images (as well as between the two captions), only one aspect changes at a time from among the following possible types of changes: object, attribute, count, and spatial relation.
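As an illustration of the matching task, the snippet below scores all four image-caption pairs with a CLIP-style model and counts an example correct only if each caption retrieves its own image. CLIP here is only a stand-in scorer and the file paths are illustrative; the benchmark itself is model-agnostic.

```python
# Sketch of a VisMin-style check: given two minimally different images and
# two minimally different captions, score all four pairings and test whether
# the model matches each caption to its own image. CLIP is a stand-in scorer.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def vismin_pair_correct(image_a, image_b, caption_a, caption_b) -> bool:
    """True if both captions are matched to their own image."""
    inputs = processor(text=[caption_a, caption_b],
                       images=[image_a, image_b],
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        sims = model(**inputs).logits_per_image  # shape: (2 images, 2 captions)
    # caption_a should score higher with image_a, caption_b with image_b
    return bool(sims[0, 0] > sims[1, 0] and sims[1, 1] > sims[0, 1])

# Example usage with actual files (paths and captions are illustrative):
# img_a, img_b = Image.open("original.jpg"), Image.open("edited.jpg")
# print(vismin_pair_correct(img_a, img_b,
#                           "a red mug on the table",
#                           "a blue mug on the table"))
```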



A Benchmark for Parsing Ambiguous Questions into Database Queries

Neural Information Processing Systems

Practical semantic parsers are expected to understand user utterances and map them to executable programs, even when these are ambiguous. We introduce a new benchmark, which we hope will inform and inspire the development of text-to-SQL parsers capable of recognizing and interpreting ambiguous requests. Our dataset contains questions showcasing three different types of ambiguity (scope ambiguity, attachment ambiguity, and vagueness), their interpretations, and corresponding SQL queries. In each case, the ambiguity persists even when the database context is provided. This is achieved through a novel approach that involves controlled generation of databases from scratch. We benchmark various LLMs on this dataset, revealing that even the most advanced models struggle to identify and interpret ambiguity in questions.
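A toy example of scope ambiguity (invented here, not drawn from the dataset) makes the task concrete: the same question maps to two defensible SQL programs, and a parser should surface both interpretations rather than silently committing to one. The schema below is hypothetical.

```python
# Illustrative scope ambiguity in text-to-SQL (schema and question invented):
# "Which reviewer rated every movie?" can ask for reviewers who rated *all*
# movies, or simply pair each reviewer with the movies they rated.
question = "Which reviewer rated every movie?"

interpretations = {
    "universal scope": """
        SELECT r.name
        FROM reviewers r
        JOIN ratings t ON t.reviewer_id = r.id
        GROUP BY r.id, r.name
        HAVING COUNT(DISTINCT t.movie_id) = (SELECT COUNT(*) FROM movies);
    """,
    "pairwise reading": """
        SELECT r.name, m.title
        FROM reviewers r
        JOIN ratings t ON t.reviewer_id = r.id
        JOIN movies m ON m.id = t.movie_id;
    """,
}
```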


PG-SAM: Prior-Guided SAM with Medical Prompts for Multi-organ Segmentation

arXiv.org Artificial Intelligence

Segment Anything Model (SAM) demonstrates powerful zero-shot capabilities; however, its accuracy and robustness significantly decrease when applied to medical image segmentation. Existing methods address this issue through modality fusion, integrating textual and image information to provide more detailed priors. In this study, we argue that the granularity of text and the domain gap affect the accuracy of the priors. Furthermore, the discrepancy between high-level abstract semantics and pixel-level boundary details in images can introduce noise into the fusion process. To address this, we propose Prior-Guided SAM (PG-SAM), which employs a fine-grained modality prior aligner to leverage specialized medical knowledge for better modality alignment. The core of our method lies in efficiently addressing the domain gap with fine-grained text from a medical LLM. Meanwhile, it also enhances the priors' quality after modality alignment, ensuring more accurate segmentation. In addition, our decoder enhances the model's expressive capabilities through multi-level feature fusion and iterative mask optimizer operations, supporting unprompted learning. We also propose a unified pipeline that effectively supplies high-quality semantic information to SAM. Extensive experiments on the Synapse dataset demonstrate that the proposed PG-SAM achieves state-of-the-art performance. Our anonymous code is released at https://github.com/logan-0623/PG-SAM.
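The structural sketch below illustrates one way a fine-grained text prior could be aligned with image tokens before a SAM-style mask decoder. All dimensions and module names are placeholders, not the released PG-SAM code; see the repository linked above for the actual implementation.

```python
# Structural sketch (placeholders, not the released PG-SAM code) of aligning a
# fine-grained text prior from a medical LLM with image features before a
# SAM-style mask decoder.
import torch
import torch.nn as nn

class PriorAligner(nn.Module):
    """Projects LLM text embeddings into the image feature space and fuses
    them via cross-attention to form a guidance prior."""
    def __init__(self, text_dim=4096, img_dim=256, heads=8):
        super().__init__()
        self.proj = nn.Linear(text_dim, img_dim)
        self.cross_attn = nn.MultiheadAttention(img_dim, heads, batch_first=True)

    def forward(self, img_tokens, text_emb):
        # img_tokens: (B, N, img_dim); text_emb: (B, T, text_dim)
        prior = self.proj(text_emb)                        # (B, T, img_dim)
        fused, _ = self.cross_attn(img_tokens, prior, prior)
        return img_tokens + fused                          # residual guidance

aligner = PriorAligner()
img_tokens = torch.randn(1, 64 * 64, 256)   # e.g. SAM ViT feature tokens
text_emb = torch.randn(1, 16, 4096)         # e.g. medical-LLM report embedding
guided = aligner(img_tokens, text_emb)      # would feed the mask decoder downstream
print(guided.shape)                         # torch.Size([1, 4096, 256])
```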


Ground Penetrating Radar-Assisted Multimodal Robot Odometry Using Subsurface Feature Matrix

arXiv.org Artificial Intelligence

Localization of robots using subsurface features observed by ground-penetrating radar (GPR) enhances and adds robustness to common sensor modalities, as subsurface features are less affected by weather, seasons, and surface changes. We introduce an innovative multimodal odometry approach using inputs from GPR, an inertial measurement unit (IMU), and a wheel encoder. To efficiently address GPR signal noise, we introduce an advanced feature representation called the subsurface feature matrix (SFM). The SFM leverages frequency domain data and identifies peaks within radar scans. Additionally, we propose a novel feature matching method that estimates GPR displacement by aligning SFMs. Measurements from these three input sources are fused using a factor graph approach to achieve multimodal robot odometry. Our method has been developed and evaluated on the public CMU-GPR dataset, demonstrating improvements in accuracy and robustness with real-time performance in robotic odometry tasks.
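The toy code below sketches the core idea under stated assumptions: reduce each GPR trace to a few frequency-domain features, then estimate the along-track shift between two scans by aligning their feature matrices. It is illustrative only; the paper's SFM construction, peak detection, and factor-graph fusion are more involved.

```python
# Toy sketch (not the authors' implementation) of the subsurface feature
# matrix idea: convert each GPR trace to the frequency domain, keep a few
# magnitude features per trace, then estimate the along-track shift between
# two scans by correlating their feature matrices.
import numpy as np

def subsurface_feature_matrix(scan: np.ndarray, n_bins: int = 32) -> np.ndarray:
    """scan: (n_traces, n_samples) B-scan. Returns (n_traces, n_bins) of
    leading frequency-domain magnitudes per trace (crude stand-in for the
    paper's peak detection)."""
    spectra = np.abs(np.fft.rfft(scan, axis=1))
    return spectra[:, :n_bins]

def estimate_shift(sfm_prev: np.ndarray, sfm_curr: np.ndarray) -> int:
    """Along-track displacement (in traces) that best aligns the two SFMs."""
    best_shift, best_score = 0, -np.inf
    max_shift = sfm_prev.shape[0] // 4
    for s in range(-max_shift, max_shift + 1):
        a = sfm_prev[max(0, -s): sfm_prev.shape[0] - max(0, s)]
        b = sfm_curr[max(0, s): sfm_curr.shape[0] - max(0, -s)]
        score = np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift

prev = np.random.randn(200, 512)
curr = np.roll(prev, 5, axis=0)             # simulate a 5-trace displacement
print(estimate_shift(subsurface_feature_matrix(prev),
                     subsurface_feature_matrix(curr)))  # prints 5
```

In the full system, displacement estimates like this would be fused with IMU and wheel-encoder measurements in a factor graph rather than used on their own.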